OPEN ACCESS Freely available online

PLOS ONE

DUF581 Is Plant Specific FCS-Like Zinc Finger Involved in
Protein-Protein Interaction

Muhammed Jamsheer K, Ashverya Laxmi* 

National Institute of Plant Genome Research, New Delhi, India 

Abstract 

Zinc fingers are a ubiquitous class of protein domain with considerable variation in structure and function. Zf-FCS is a highly
diverged group of C2-C2 zinc fingers which is present in animals, prokaryotes and viruses, but not in plants. In this study we
identified that a plant specific domain of unknown function, DUF581, is a zf-FCS type zinc finger. Based on HMM-HMM
comparison and signature motif similarity we named this domain the FCS-Like Zinc finger (FLZ) domain. A genome wide
survey identified that FLZ domain containing genes are bryophytic in origin and that this gene family is expanded in
spermatophytes. Expression analysis of selected FLZ gene family members of A. thaliana identified an overlapping
expression pattern, suggesting a possible redundancy in their function. Unlike the zf-FCS domain, the FLZ domain was found to
be highly conserved in sequence and structure. Using a combination of bioinformatic and protein-protein interaction tools,
we identified that the FLZ domain is involved in protein-protein interaction.

Introduction

Identifying the functions of genes and their interactions with other
genes in the regulation of growth and development is a major task
in the post-genome-sequencing era. Although Arabidopsis thaliana genome
sequencing was completed in late 2000, the functions of a large
number of genes are still unknown [1,2]. According to TAIR10,
out of 27,416 protein coding genes in A. thaliana, the functions of about
37% of genes are unidentified [2]. To further complicate this issue,
many uncharacterized and even some functionally characterized
proteins contain domains whose function is unknown. These
uncharacterized domains are known as Domains of Unknown
Function (DUFs). The DUF nomenclature was introduced to record
and classify conserved domains that are present in proteins
but for which no functional information was available at the time.
The number of DUFs is large; PFAM release 23.0 includes over
2200 DUF families, which cover almost 22% of the total
PFAM protein families [3]. It is presumed that the majority of DUFs
are divergent members of already known domains and that the
rest may be novel folds. Although the number of DUF families in
PFAM is increasing, the identification of the functions of DUF
domains is only slowly gaining momentum. DUF3233 of gram
negative gamma proteobacteria was found to be the trans-membrane
β-barrel domain of auto-transporter proteins [4]. DUF283 of
Dicer endonuclease was predicted to form a double-stranded
RNA-binding fold [5]. Later, structural analysis proved that DUF283
forms a noncanonical double-stranded RNA-binding fold, and
functional studies confirmed that it has a weak double-stranded RNA
binding activity and a specific protein binding activity [6]. The
co-ordinated effort of the NIH Protein Structure Initiative identified the
structures of about 250 DUFs and found that the majority of them are
divergent members of well characterized domains [7].

DUF581 is a plant specific domain found in all taxa except
algae. It is highly conserved across the plant kingdom but little
explored. An A. thaliana DUF581 containing protein, MEDIATOR
OF ABA-REGULATED DORMANCY 1 (MARD1), was
identified by senescence related enhancer trapping and found to
be involved in ABA-mediated seed dormancy and induced during
senescence [8,9]. The same work identified that MARD1 possesses a novel
zinc finger domain, suggesting a relation of DUF581 to zinc
fingers of bacteria, archaea and metazoans [9]. A large scale
protein-protein interaction study in A. thaliana identified many
interacting proteins of DUF581 family proteins; however, the
biological significance of these interactions remains to be explored
[10].

DUF581 shows high signature motif similarity with the MYM-type
zinc finger with FCS sequence motif (zf-FCS). Zf-FCS was first
identified in MYM family proteins, which are related to
myeloproliferative syndrome and mental retardation [11]. Zf-FCS
domains are present in viruses, eubacteria, archaea and metazoa but not in
plants. One FCS type zinc finger protein is present in the brown alga
Ectocarpus siliculosus. Zf-FCS is named after the conserved
phenylalanine and serine residues associated with the third cysteine. In
metazoans, zf-FCS is largely present in the Polycomb-group (PcG)
proteins. PcG proteins are developmental regulators which
silence the expression of downstream genes through chromatin
remodeling and epigenetic silencing. They form a multi-protein
Polycomb Repressive Complex (PRC) which binds to the target
gene and alters its epigenetic status [12]. PcG proteins
were first identified in Drosophila melanogaster for silencing the
expression of HOX genes, which is important for proper
embryonic development [13]. They are highly conserved regulatory
proteins which play an important role in regulating
developmental events in plants and animals [14]. Zf-FCS is found
as a single domain or in tandem clusters of up to 10 repeats. Only a few
studies have addressed this domain, and they show that it is a
diverse class of zinc finger with variable functions. The single
zf-FCS in Rae28, the mouse homologue of the D. melanogaster Polyhomeotic
protein, interacts with RNA and DNA in a non-sequence-specific
manner [15]. Since Rae28 is involved in chromatin remodeling, it
was hypothesized that this zinc finger may be involved in the binding
of the PRC to the target sequence. Later, it was found that the
direct interaction of the zf-FCS domain of Human Polyhomeotic
Homologue 1 (HPH1/PHC1) with RNA is required for PHC-mediated
repression of target genes [16]. The zf-FCS domain of the
human dSfmbt homologue L(3)MBT-like 2 (L3MBTL2) is a
treble clef zinc finger similar to zinc fingers involved in
protein-nucleic acid interaction [17]. These results suggest that zf-FCS is
involved in protein-nucleic acid interaction. However, zf-FCS has also
been reported to be involved in protein-protein interaction. The
direct interaction between the D. melanogaster PcG
proteins Scm-related protein containing four mbt domains
(dSfmbt) and Sex comb on midleg (Scm) is mediated by the
zf-FCS domains present in both proteins. These two proteins interact
and cooperate synergistically in mediating target gene repression
[18]. Together, these reports show that zf-FCS is a structurally diverse
family which accommodates both protein-nucleic acid and
protein-protein interaction zinc fingers.

This study aims to characterize the function of the DUF581 protein
domain, which is exclusive to plants. Using sensitive bioinformatic
approaches, we confirmed that DUF581 is a zf-FCS like zinc
finger domain. We named this plant specific domain the FCS-Like
Zinc finger (FLZ) domain. A genome wide survey identified that the FLZ
domain has a bryophytic origin and that this gene family is expanded
in higher plants. Phylogenetic analysis of A. thaliana FLZ domain
proteins and expression analysis of selected FLZ genes were carried out.
Sequence and structure conservation studies identified that, unlike
the zf-FCS domain, the FLZ domain is highly conserved. The FLZ domain
is predicted to form a novel alpha-beta-alpha secondary structure
pattern. A combination of bioinformatics and protein-protein
interaction tools identified that FLZ acts as a protein-protein
interaction module.

Results 

DUF581 Domain Containing Proteins are Plant Specific 
FCS-Like Zinc Finger Proteins 

A genome wide survey was conducted in different databases to
identify DUF581 domain containing proteins
from sequenced plant genomes. 331 members were identified from
PFAM and 474 members were identified from InterPro [3,19].
Genes were also identified from Phytozome, PLAZA, NCBI, the
Solanaceae Genomic Resource at Michigan State University, the
Tomato Genome Database at MIPS and ConGenIE [20-24].
Sequences were manually curated to remove repeats and outliers.
The conservation of the signature motif and the structural conservation
were verified. PFAM identified a DUF581 domain containing
protein from a parasitic heterokont, Blastocystis hominis; however, in
our analysis we found that this domain lacked the conserved
alpha-beta-alpha structural pattern specific to the plant DUF581
domain. A total of 757 non-redundant DUF581 genes were
identified from 41 plant genomes (Table 1). The DUF581 gene family
is plant specific and absent from algae. Searches in the Ostreococcus tauri, O.
lucimarinus, Micromonas sp. RCC299, Volvox carteri and Chlamydomonas
reinhardtii genomes found no hits, suggesting that DUF581 genes
are absent in algae. All other members of Viridiplantae examined contain
DUF581 domain containing genes. The Physcomitrella patens genome
contains 2 DUF581 genes, suggesting a bryophytic origin of this
gene family. The pteridophyte Selaginella moellendorffii also possesses 2
DUF581 genes. Spermatophytes show an increased content of
DUF581 genes, ranging from 9 members in Capsicum annuum, Carica
papaya, Aquilegia caerulea and Lotus japonicus to 48 in Panicum virgatum.
A detailed list of all DUF581 proteins identified in this study is
given in Table S1.
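The size range quoted above can be re-derived from the per-species counts in Table 1. The following is a minimal Python sketch that uses only a small subset of those counts; the dictionary and function names are illustrative assumptions and are not part of the original analysis.

```python
# Counts copied from Table 1 for a small, illustrative subset of species.
# Keys are species, values are (taxonomic group, number of FLZ genes).
FLZ_COUNTS = {
    "Physcomitrella patens": ("Bryophyta", 2),
    "Selaginella moellendorffii": ("Pteridophyta", 2),
    "Picea abies": ("Gymnosperms", 23),
    "Arabidopsis thaliana": ("Dicots", 18),
    "Carica papaya": ("Dicots", 9),
    "Glycine max": ("Dicots", 37),
    "Oryza sativa": ("Monocots", 29),
    "Panicum virgatum": ("Monocots", 48),
}

# Groups treated here as spermatophytes (seed plants).
SEED_PLANT_GROUPS = {"Gymnosperms", "Dicots", "Monocots"}


def seed_plant_range(counts):
    """Return (min, max) FLZ gene number among the seed plants in `counts`."""
    values = [n for group, n in counts.values() if group in SEED_PLANT_GROUPS]
    return min(values), max(values)


print(seed_plant_range(FLZ_COUNTS))
```

Extending the dictionary to all 41 genomes would allow the same summary to be produced per taxonomic group.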

DUF581 and zf-FCS domains are members of the TRASH clan of the
PFAM database and show very high similarity in sequence
conservation (Figure S1). The TRASH superfamily includes cysteine
co-ordinated metal binding domains conserved in both
prokaryotes and eukaryotes [25]. The other members of this
superfamily include MYND, mitochondrial splicing suppressor 51, HIT
zinc fingers, the two DUF domains DUF2256 and DUF329, the
archaeal TRASH metal-binding domain, the putative metal-binding
domain of cation transport ATPase, the YHS domain, and
ribosomal protein L24e. All members of the TRASH clan show
varying degrees of similarity in the signature sequence motif (Figure
S1). Sequence alignment between metazoan zf-FCS domains and
DUF581 domains from plants shows that they possess a very similar
consensus cysteine signature sequence, with conserved
phenylalanine and serine residues associated with the third cysteine
(Figure 1A). Zf-FCS possesses a consensus CX2CX14-30FCSX2C zinc
finger motif while DUF581 shows a nearly identical CX2CX17-19FCSX2C
motif. In HMM-HMM comparison, both domains show a very
similar alignment, suggesting that both domains are nearly
identical in signature sequence motif (Figure 1B). The above
results suggest that DUF581 is a zf-FCS like C2-C2 zinc finger.
Based on these observations, we named DUF581 the FCS-Like
Zinc finger (FLZ) domain. The proteins which possess this domain
are named FCS-like zinc finger (FLZ) proteins.
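The signature motif described above can be written as a regular expression. The sketch below is only an illustration of that idea in Python, not the HMM-based method used in this study. The zf-FCS pattern assumes the 14 to 30 residue spacer quoted in the Discussion, and the toy sequences are made up (a poly-glycine spacer), not real proteins.

```python
import re

# Signature motifs from the text, written as regular expressions.
# FLZ:    C-X2-C-X17-19-F-C-S-X2-C
# zf-FCS: same layout, assuming a 14 to 30 residue spacer.
FLZ_MOTIF = re.compile(r"C.{2}C.{17,19}FCS.{2}C")
ZF_FCS_MOTIF = re.compile(r"C.{2}C.{14,30}FCS.{2}C")


def has_motif(pattern, protein_seq):
    """Return True if the compiled motif occurs anywhere in the sequence."""
    return pattern.search(protein_seq.upper()) is not None


def toy_domain(spacer_len):
    """Build a made-up sequence with a poly-Gly spacer (hypothetical, not a real protein)."""
    return "MK" + "CAEC" + "G" * spacer_len + "FCSKEC" + "RR"


for n in (10, 18, 25):
    seq = toy_domain(n)
    print(n, has_motif(FLZ_MOTIF, seq), has_motif(ZF_FCS_MOTIF, seq))
```

A plain pattern match of this kind ignores residue conservation outside the fixed positions, which is why profile methods such as HMM-HMM comparison are more sensitive for remote homologues.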

The Arabidopsis FLZ Gene Family 

The A. thaliana genome possesses 18 FLZ domain genes (Table 1).
All of these genes have a single splice form except AT3G63230,
which forms two splice variants. AT1G53885
and AT1G53903 were found to be tandem duplicates with
exactly the same gene sequence. To understand the
evolutionary relationship between individual members, a
phylogram was constructed using the full length protein
sequences of all FLZ proteins (Figure S2). The phylogram
distinguished different clades of FLZ proteins. The other
members were named on the basis of their relation to FLZ1
in the phylogram. Among all the proteins, FLZ16 and
FLZ17/18 showed the most divergence from other members and
formed individual distinct clades. Similarly, FLZ15 also
formed a clade distinct from the other proteins. All other members
were grouped into two big clades, clade I and clade II, with 7 members
each. A few members in each clade were very closely
positioned, hinting at possible redundancy in their function.
Redundancy in expression pattern and function is a common
feature of many multigene families of A. thaliana
[26,27]. Analysis of the expression profiles of three closely related
members of the FLZ gene family from clade I, using publicly
available microarray data, revealed that they show both distinct
and overlapping expression patterns (Figure S3). The maximum
expression of FLZ1 was observed in developing seeds. FLZ2
and FLZ3 were also fairly well expressed in different seed stages.
Apart from seed stages, FLZ1 showed higher expression in
imbibed seeds, stamens, carpels, and the transition shoot apex,
while FLZ2 was strongly expressed in cauline leaf, first node,
and second internode and in different floral stages and organs.
FLZ3 had an almost uniform expression pattern, with pronounced
up-regulation in the first node, second internode, cotyledon, and
different floral organs. FLZ1, FLZ2 and FLZ3 also
showed higher expression in senescing leaves compared to
rosette leaves.

Table 1. Distribution of FLZ gene family in sequenced genomes.

Taxonomic position   Species                      Number of FLZ genes
Bryophyta            Physcomitrella patens        2
Pteridophyta         Selaginella moellendorffii   2
Gymnosperms          Picea abies                  23
Dicots               Arabidopsis thaliana         18
                     Arabidopsis lyrata           18
                     Aquilegia caerulea           9
                     Brassica rapa                34
                     Capsella rubella             16
                     Capsicum annuum              9
                     Carica papaya                9
                     Cicer arietinum              15
                     Citrus clementina            13
                     Citrus sinensis              13
                     Cucumis sativus              16
                     Eucalyptus grandis           19
                     Fragaria vesca               14
                     Glycine max                  37
                     Gossypium raimondii          28
                     Linum usitatissimum          16
                     Lotus japonicus              9
                     Malus domestica              22
                     Manihot esculenta            18
                     Medicago truncatula          12
                     Mimulus guttatus             14
                     Nicotiana tabacum            25
                     Phaseolus vulgaris           19
                     Populus trichocarpa          21
                     Prunus persica               12
                     Ricinus communis             11
                     Solanum lycopersicum         15
                     Solanum phureja              15
                     Thellungiella halophila      16
                     Theobroma cacao              12
                     Vitis vinifera               10
Monocots             Brachypodium distachyon      26
                     Hordeum vulgare              16
                     Oryza sativa                 29
                     Panicum virgatum             48
                     Setaria italica              28
                     Sorghum bicolor              29
                     Zea mays                     29

doi:10.1371/journal.pone.0099074.t001



FLZ Domain is a Novel Zinc-finger Domain with a Highly 
Conserved Alpha-beta-alpha Secondary Structure Pattern 

Figure 1. Alignment between FLZ and zf-FCS domain. (A) Multiple sequence alignment between FLZ and zf-FCS domains. Conserved cysteine,
phenylalanine and serine residues are marked by asterisks. (B) HMM-HMM alignment between FLZ domain and zf-FCS showing similarity in sequence
conservation.

The FLZ domain is predicted to have a highly conserved secondary
structure pattern. It is composed of an N-terminal short α-helix and a
β-hairpin followed by a longer C-terminal α-helix (Figure 2A).
Interestingly, this kind of secondary structure pattern is not found
in any of the classified structural classes of zinc fingers [28].
Residue conservation analysis of the FLZ domain across the plant
kingdom showed that the four cysteine residues are highly
conserved, along with the signature phenylalanine and serine residues
associated with the third cysteine (Figure 2B). The highly conserved
α-helix/β-hairpin/α-helix secondary structure pattern results
from conserved amino acids which favor the formation of α-helix
and β-sheet at specific regions. Alanine, cysteine, leucine,
methionine, lysine, glutamine and histidine show high helix
forming propensity, while tyrosine, valine, phenylalanine,
isoleucine, tryptophan, and threonine favor β-sheet [29,30]. The
highly conserved phenylalanine and lysine residues, followed by
fairly conserved aspartic acid and alanine, along with the first
cysteine and the following phenylalanine, contribute to the
formation of the N-terminal short helix. In helices, glutamic acid,
phenylalanine and aspartic acid are found at higher frequencies
than expected from their helix propensity [29]. The
middle β-sheet is formed by the conserved isoleucine,
phenylalanine, methionine, and tyrosine residues. The longer C-terminal
helix is at the position of the fourth cysteine, associated with conserved
glutamic acid and fairly conserved arginine, aspartic acid, and
glutamine residues, which generally favor helix formation. Together
with the highly conserved cysteine residues, the fair conservation
of the other residues results in a highly conserved topology of the
FLZ domain across the plant kingdom.
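The propensity argument above can be illustrated with a simple lookup. The Python sketch below labels each residue using only the two residue lists given in the text; it is a crude illustration and not a secondary structure predictor (structural conservation in this study was analyzed with Ali2D). The function name and the example peptide are hypothetical.

```python
# Residue classes as listed in the text, in one-letter code.
HELIX_FORMERS = set("ACLMKQH")  # Ala, Cys, Leu, Met, Lys, Gln, His
SHEET_FORMERS = set("YVFIWT")   # Tyr, Val, Phe, Ile, Trp, Thr


def propensity_string(seq):
    """Map each residue to 'h' (helix former), 'e' (sheet former) or '-' (neither list)."""
    labels = []
    for aa in seq.upper():
        if aa in HELIX_FORMERS:
            labels.append("h")
        elif aa in SHEET_FORMERS:
            labels.append("e")
        else:
            labels.append("-")
    return "".join(labels)


# Made-up peptide, used only to show the labelling.
print(propensity_string("ALKGVIYP"))
```

Runs of `h` or `e` in the output only indicate where the listed residue classes cluster; real predictions also depend on sequence context.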

Domain Organization and Distribution in FLZ Protein Family

Domain distribution and organization of FLZ family proteins
were analyzed with InterProScan [31]. With the exception of three
members, the proteins contain no functional domain other than
FLZ, suggesting a pivotal role of the FLZ domain in their function
(Figure 3). In most cases, the single FLZ domain is situated near
the C-terminal end of the protein. Two Fragaria proteins contain
other domains along with the FLZ domain. F.ve mrna20323.1
contains two Cupin (PF00190) domains, while F.ve mrna01033.1
contains an ion-transport protein domain (PF00520), a cyclic
nucleotide-binding domain and DUF3354 (PF11834) along with a
C-terminal FLZ domain. An FLZ protein in apple,
MDP0000136760, shows tandem pentatricopeptide repeats along
with an N-terminal FLZ domain.

FLZ Domain is Involved in Protein-protein Interaction 

Threading/fold recognition is helpful in identifying structural 
and functional aspects of novel folds even if they possess remote 
homology with characterized domains [32,33]. Threading of FLZ 
with Phyre revealed that it shows high fold similarity with LIM 
domains (Figure S4). LIM domains are zinc finger domains with 
two tandem zinc fingers. Each of these zinc fingers forms a treble- 
clef fold and participates in protein-protein interaction [34]. 
Threading of FLZ gave reliable predictions with a precision up to 
90% for LIM domains. This prompted us to speculate that FLZ 
might also be a protein-protein interaction zinc finger. 

To find out whether FLZ proteins are involved in protein-protein
interaction, a yeast two-hybrid (Y2H) assay was conducted with an
A. thaliana FLZ domain containing protein, AT5G47060. We
named this protein FCS-like Zinc Finger 1 (FLZ1). 50 colonies
were screened to identify the interacting proteins, and 4 genuine
interacting proteins were identified. A list of all interacting proteins
identified in this study is given in Table S2. To find out whether
the FLZ domain of FLZ1 is involved in protein-protein
interaction, deletion constructs of the FLZ1 gene were generated
(Figure 4B). The N-terminal fragment corresponds to amino acids 1 to 88
of the full length FLZ1 protein, while the FLZ domain
corresponds to amino acids 89 to 140. The C-terminal
fragment comprises amino acids 141 to 177 of the whole
protein. We repeated the Y2H with the deletion fragments of FLZ1
and PLANT AND FUNGI ATYPICAL DUAL-SPECIFICITY
PHOSPHATASE 3 (PFA-DSP3) and SALT TOLERANCE
HOMOLOG2 (STH2), which had earlier been found to interact
with full length FLZ1 (Figure 4A). In Y2H with the deletion
constructs, we found that only the FLZ domain can mediate the
protein-protein interaction with the prey proteins, suggesting its
role in protein-protein interaction (Figure 4C). In the
β-galactosidase assay, the FLZ domain showed nearly half the interaction
strength of the full length bait, while the N-terminal and C-terminal
fragments showed very minimal enzyme activity, indicating that the FLZ
domain alone is responsible for the interaction of FLZ1 with other
proteins (Figure 4D, E). However, the roughly two-fold reduction in
interaction strength when the FLZ domain alone interacted with the
prey proteins suggests that the other parts of the protein may
help to provide a strong interaction between both proteins.
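The fragment boundaries given above (1-88, 89-140 and 141-177) use 1-based, inclusive amino acid coordinates. As a minimal Python sketch, under the assumption that the protein is available as a plain string, the fragments can be cut and checked for gaps or overlaps as follows. The helper names are hypothetical, and the sequence is a placeholder of the stated length rather than the real FLZ1 sequence.

```python
# Deletion construct coordinates from the text (1-based, inclusive).
FRAGMENTS = {
    "N-terminal": (1, 88),
    "FLZ domain": (89, 140),
    "C-terminal": (141, 177),
}


def cut_fragments(protein_seq, fragments):
    """Slice a protein string into named fragments using 1-based inclusive coordinates."""
    return {name: protein_seq[start - 1:end] for name, (start, end) in fragments.items()}


def tiles_protein(fragments, length):
    """Return True if the fragments cover residues 1..length with no gaps or overlaps."""
    expected_start = 1
    for start, end in sorted(fragments.values()):
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start == length + 1


placeholder = "A" * 177  # placeholder only; NOT the FLZ1 sequence
pieces = cut_fragments(placeholder, FRAGMENTS)
print({name: len(seq) for name, seq in pieces.items()})
print(tiles_protein(FRAGMENTS, 177))
```

The three fragments are 88, 52 and 37 residues long, which together account for all 177 residues of the full length protein.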

To confirm the results obtained from the Y2H assay, we performed a BiFC
assay of the FLZ1 and PFA-DSP3 interaction. In the BiFC assay using the
onion epidermis system, it was found that these two proteins
interact in the nucleolus (Figure 5A). Apart from its wide use as a
DNA stain, DAPI is also used as a negative stain for the nucleolus
[35-37]. Negative staining of the nucleolus with DAPI confirmed that both
proteins interact exclusively in the nucleolus (Figure 5A). Further,
we checked whether the FLZ domain alone can mediate the
interaction between FLZ1 and PFA-DSP3. As observed in the
Y2H experiment, we found that the FLZ domain alone is sufficient for
the interaction of these two proteins, confirming the role of the FLZ
domain in protein-protein interaction (Figure 5B). To confirm the
specificity of this interaction, we used another A. thaliana FLZ
domain containing protein, AT5G49120, and checked whether it
can interact with PFA-DSP3. AT5G49120 did not interact with
PFA-DSP3, suggesting that the interaction is
specific to FLZ1 (Figure 5C). Normally, FLZ1 localizes in the
nucleus and cytoplasm while PFA-DSP3 localizes exclusively in the
nucleus (Figure 6). However, their interaction was found to be
exclusive to the nucleolus, suggesting a possible role in nucleolar
function.

Discussion 

In this study, FLZ domain containing proteins were
identified from 41 plant species. They are completely absent in
algae. The most basal plant in which FLZ domain proteins were found
is the bryophyte P. patens, suggesting a bryophytic origin. In higher
plants, the FLZ gene family is highly expanded. Most plants
are paleopolyploids. Two whole genome duplication events
that happened before the diversification of seed plants expanded and
diversified many of the regulatory gene families, especially genes
which are related to flowering and seed development [38]. Gene
families evolve through segmental and tandem gene duplication
of parent genes [39]. The largest number of FLZ genes is found in the
tetraploid genome of P. virgatum AP13, implying a role of genome
duplication in the expansion of the FLZ gene family.

Analysis of the evolutionary relationship between Arabidopsis FLZ
proteins revealed the position of individual members inside the
family. Expression profiling of three closely related members
revealed an overlap in their expression domains, suggesting
possible redundancy in function. In general, all three genes were
expressed in different floral organs and in flower and seed developmental
stages. FLZ1 was also expressed in the transition shoot apex, suggesting
a role in regulating phase transition. In Y2H, we identified that
FLZ1 interacts with CONSTANS-LIKE 1 (COL1), which is a
homologue of the flowering time gene CONSTANS (CO). FLZ1 also
interacts with STH2, which is mainly involved in light regulated
development and shade avoidance [40,41]. We identified that
FLZ1 interacts with a dual specificity phosphatase, PFA-DSP3, in the
nucleolus. Identifying the biological significance of these
interactions can shed light on the possible role of FLZ1 in different
developmental stages. Like MARD1, all three genes analyzed in
this study showed transcript accumulation in senescing leaves
compared to rosette leaves, suggesting a function of the FLZ gene
family in senescence.

Figure 2. Secondary structure pattern and sequence conservation of FLZ domain. (A) Secondary structure conservation of FLZ domain.
Red color indicates alpha helix and blue color indicates beta-sheet. Confidence gradient of secondary structure formation is given on the top. (B)
Sequence logo of Arabidopsis, Medicago, poplar and rice FLZ domains showing amino acid conservation.
doi:10.1371/journal.pone.0099074.g002

Figure 3. Schematic representation of domain organization in FLZ protein family. The FLZ proteins are scanned by InterProScan to identify
conserved domains. A. thaliana FLZ1 is shown as a representative model for proteins which contain FLZ domain only. Proteins which possess other
domains along with FLZ are also shown. The domains are abbreviated as follows, FLZ (FCS like zinc finger, PF04570), PPR (Pentatricopeptide Repeat,
PF01535), Cupin (Cupin 1, PF00190), ICF (Ion channel family, PF00520), CND (Cyclic nucleotide-binding domain, PF00027) and DUF3354 (Domain of
unknown function 3354, PF11834).
doi:10.1371/journal.pone.0099074.g003

Figure 4. FLZ acts as the module for protein-protein interaction. (A) FLZ1 interacts with PFA-DSP3 and STH2 in Y2H. Murine p53 and SV40
large T-antigen interaction is taken as positive control and p53 and lamin interaction is taken as negative control. (B) The deletion constructs of FLZ1,
N-terminal (1-88 amino acids), FLZ (89-140 amino acids) and C-terminal (141-177 amino acids). (C) Y2H with deletion constructs of FLZ1 with PFA-DSP3
and STH2 showing FLZ is essential for their interaction. (D) and (E) beta-galactosidase activity of full length FLZ1 and deletion constructs interaction
with PFA-DSP3 and STH2.
doi:10.1371/journal.pone.0099074.g004

Figure 5. FLZ domain mediates the interaction of FLZ1 and
PFA-DSP3. (A) BiFC, FLZ1 and PFA-DSP3 interact exclusively in
nucleolus. Upper panel: whole cell, YFP alone, DAPI alone and merged.
Lower panel: nucleus zoomed, YFP alone, DAPI alone and merged. (B)
BiFC of PFA-DSP3 with deletion constructs of FLZ1: N-terminal (1-88
amino acids), FLZ (89-140 amino acids) and C-terminal (141-177 amino
acids). (C) BiFC of PFA-DSP3 with AT5G49120: YFP alone, DAPI alone and
merged. YFP was excited at 514 nm and emission was recorded at
530 nm. DAPI was excited at 351 nm and emission was recorded at
450 nm.
doi:10.1371/journal.pone.0099074.g005

FLZ genes are a poorly studied gene family which is
specific to plants. Early efforts to understand the role of these
genes identified that they are related to senescence and ABA
mediated seed dormancy [8,9]. They encode small proteins, and
almost all of them contain only a single functional domain, FLZ.
Decoding the function of FLZ is therefore key to the functional
characterization of this family. From the functional
characterization of individual DUF families and the co-ordinated work of
the NIH Protein Structure Initiative, it has been found that most DUFs
are diverged members of already characterized domains
[4,7,42]. Taking this notion into account, the analysis of sequence
conservation of the FLZ domain clearly identified that it is highly
related to zf-FCS. As in the case of zf-FCS, the phenylalanine and
serine residues associated with the third cysteine are also fairly conserved
in the FLZ domain. The major difference between these two domains
is in the length of the spacer region which connects the two
cysteine pairs of the zinc finger. The spacer region of zf-FCS is highly
variable, ranging from 14 to 30 residues. However, the spacer region of FLZ is
much more conserved, varying only from 17 to 19 residues. It has
already been found that the spacer region of zinc fingers varies even
among members of the same class and that variation in the
spacer region influences the function of the zinc finger [43,44]. It
appears that the divergent functions of zf-FCS arise from
the variation in the length of the spacer region. This variation
results in different secondary structure patterns, which makes
zf-FCS a multifunctional zinc finger class (Data not shown).
However, in the case of the FLZ domain, the variation in the spacer
length is only two residues, suggesting a highly conserved function
across species.
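The spacer comparison above can be made concrete by measuring the number of residues between the first cysteine pair and the FCS triplet in each domain sequence. The following Python sketch shows one way to do this with a regular expression. It is an illustration under stated assumptions: the function names are hypothetical, and the input sequences are made-up toy strings whose spacer lengths simply mirror the ranges quoted in the text.

```python
import re

# Capture the residues between the second cysteine and the FCS triplet.
SPACER_RE = re.compile(r"C.{2}C(?P<spacer>.+?)FCS.{2}C")


def spacer_length(domain_seq):
    """Return the spacer length of the first C-X2-C...FCS-X2-C match, or None."""
    match = SPACER_RE.search(domain_seq.upper())
    return len(match.group("spacer")) if match else None


def spacer_range(domain_seqs):
    """Return (shortest, longest) spacer among sequences that contain the motif."""
    lengths = [n for n in map(spacer_length, domain_seqs) if n is not None]
    return (min(lengths), max(lengths)) if lengths else None


def toy(spacer_len):
    """Made-up domain with a poly-Gly spacer (hypothetical, not a real protein)."""
    return "CAEC" + "G" * spacer_len + "FCSKEC"


flz_like = [toy(n) for n in (17, 18, 19)]
zf_fcs_like = [toy(n) for n in (14, 22, 30)]
print(spacer_range(flz_like))
print(spacer_range(zf_fcs_like))
```

Applied to real domain alignments, the width of the returned range gives a quick measure of how constrained the spacer is in each family.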

For identifying the function of a DUF, a structure based
approach has been found to be more effective than a sequence based
search. The function of a protein domain is defined by the fold it
forms, so during the course of evolution the structure is likely to be
more conserved than the sequence [45]. Determining the
structure of a DUF and searching for close folds among already
solved structures has helped to identify the function of many DUF
domains [6,7,42]. Fold recognition can also be employed to
identify the homology of a DUF with already solved structures.
Fold recognition of the FLZ domain identified that it is
structurally very similar to the LIM domain, which is a
protein-protein interaction zinc finger. Subsequently, we found that the
FLZ domain of the A. thaliana FLZ1 protein is indispensable for its
interaction with PFA-DSP3 and STH2. However, the strength of
the interaction was reduced to half when the FLZ domain alone
interacted with PFA-DSP3 and STH2, which suggests that the
other portions of the protein might help in
ensuring a tight interaction. Notably, FLZ is not structurally
similar to the protein-protein interaction zf-FCS domains of
dSfmbt and Scm (Data not shown). All these results suggest that the
FLZ domain is a highly diverged group of plant specific zf-FCS
which functions as a protein-protein interaction module.

The analysis of the secondary structure pattern identified that FLZ
forms an alpha-beta-alpha secondary structure pattern.
Interestingly, this kind of secondary structure pattern has not been reported in
any classified zinc finger group so far [28]. It was also observed that,
unlike the zf-FCS domain, the FLZ domain is highly conserved in
sequence and structure. Considering the conservation in structure
and its relation to the LIM domain, it is unlikely that the FLZ domain
also interacts with nucleic acids as some members of zf-FCS do.
The variation in sequence and structure in the zf-FCS group
must be the reason for their diverse functions, such as nucleic acid
binding and protein binding. A structure based classification of
zf-FCS will be helpful to differentiate the functional subclasses and to
understand the evolution of this divergence.

In short, using a combination of bioinformatics and
protein-protein interaction studies, we found that DUF581 is an FCS-like
zinc finger which acts as a module for protein-protein interaction.
It possesses a highly conserved and novel secondary structure
pattern. FLZ domain containing proteins are plant specific and
bryophytic in origin. Local and whole genome duplication resulted
in the expansion of this gene family in higher plants. Expression
analysis of selected A. thaliana FLZ gene family members showed
an overlap in their expression domains.

Materials and Methods 

Identification of FLZ Gene Family Members from Public 
Data Bases 

In this study, we identified FLZ family genes from 41 species of
Viridiplantae. Using the key word 'DUF581', a search was
performed in PFAM, PLAZA v 2.5 and InterPro [3,21,19]. Genes
were also identified from Phytozome using the PFAM identifier
PF04570 [20]. FLZ genes from Solanaceae were identified from the
Solanaceae Genomic Resource using the InterPro ID IPR007650.
Members from barley and Cicer arietinum were identified using
NCBI BLASTp [22]. The Picea abies FLZ genes were identified
from ConGenIE using BLASTp [24]. Protein sequences were
downloaded and manually curated for repeats. Outliers were
removed using InterProScan and multiple sequence alignment
with Clustal X 2.0 [31,46]. The structural conservation was
analyzed using Ali2D [47].
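Because the same protein can be retrieved from several databases, the pooled sequences contain repeats. The curation in this study was done manually; the Python sketch below only illustrates how exact duplicates could be flagged automatically beforehand. The function names and the FASTA identifiers are hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (identifier, sequence) pairs."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            # Use the first word of the header line as the identifier.
            header, chunks = (line[1:].split() or [""])[0], []
        else:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def drop_exact_duplicates(records):
    """Keep the first identifier seen for each distinct sequence."""
    first_seen = {}
    for identifier, seq in records:
        first_seen.setdefault(seq, identifier)
    return [(identifier, seq) for seq, identifier in first_seen.items()]


# Made-up records: the same toy sequence retrieved from two sources.
EXAMPLE = """>source1_geneA
MKCAEC
GGFCS
>source2_geneA
MKCAECGGFCS
>source1_geneB
MKCTTCGG
"""

unique = drop_exact_duplicates(parse_fasta(EXAMPLE))
print([identifier for identifier, _ in unique])
```

Collapsing on sequence alone would also merge genuinely distinct loci with identical sequences, such as the tandem duplicates AT1G53885 and AT1G53903, so any automatic step of this kind still needs a check against locus identifiers.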

Bioinformatics Tools Used 

For multiple sequence alignment, FLZ and zf-FCS domain
sequences were retrieved from PFAM. They were aligned with
Clustal X 2.0 and visualized using MView [46,48]. Pairwise HMM
logo comparison was done using LogoMat-P [49]. Fold recognition
of the FLZ domain was done using Phyre v 0.2 [50]. Sequence
logos were generated using WebLogo [51]. The domain organization
was drawn with PROSITE MyDomains [52]. The phylogenetic
tree of the Arabidopsis FLZ gene family was generated using
MEGA 5 [53]. The expression graphs of FLZ genes were obtained
from the Arabidopsis eFP browser [54].

Figure 6. Sub-cellular localization of FLZ1 and PFA-DSP3 in onion epidermal cells. (A) Vector alone. (B) FLZ1. (C) PFA-DSP3. Left to right,
YFP only, bright field, merged. FLZ1 is localized in both nucleus and cytoplasm while PFA-DSP3 is exclusively localized in nucleus. YFP was excited at
514 nm and emission was recorded at 530 nm.
doi:10.1371/journal.pone.0099074.g006



Yeast Two-hybrid Assay 

The yeast two-hybrid assay was conducted using the Matchmaker Gold
Yeast Two-Hybrid System (Clontech, Mountain View, CA)
according to the manufacturer's protocol. FLZ1 was cloned in
pGBKT7 and used as bait to screen the normalized Mate & Plate
Universal Arabidopsis yeast two-hybrid cDNA library (Clontech,
Mountain View, CA). The interaction of PFA-DSP3 and STH2
was confirmed by cloning them in pGADT7 and performing a one-to-one
interaction check with FLZ1. pGBKT7-53 and pGADT7-T were
used as positive control and pGBKT7-Lam and pGADT7-T were
used as negative control for the experiments. Deletion constructs
of FLZ1 were made in pGBKT7 and interaction was checked with
pGADT7-PFA-DSP3 and pGADT7-STH2. The primers used for
cloning are shown in Table S3.

β-Galactosidase Assay

Bait and prey proteins were co-transformed into the Y187 yeast strain
and the β-galactosidase assay was conducted according to the
protocol of the Yeast Protocols Handbook (Clontech, Mountain View,
CA). The result was the average of three independent experiments.
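Averaging three independent experiments and expressing each construct relative to the full length bait can be sketched as below. This is a minimal Python illustration only: the activity values are invented placeholders, not data from this study, and the function and construct names are assumptions.

```python
from statistics import mean, stdev


def summarize(replicates):
    """Return {construct: (mean, sample standard deviation)} for replicate activities."""
    return {name: (mean(values), stdev(values)) for name, values in replicates.items()}


def relative_to(summary, reference):
    """Express each construct's mean activity as a fraction of the reference mean."""
    reference_mean = summary[reference][0]
    return {name: avg / reference_mean for name, (avg, _) in summary.items()}


# Invented beta-galactosidase units from three experiments (placeholders, NOT real data).
activity = {
    "full_length": [10.0, 12.0, 11.0],
    "FLZ_domain": [5.0, 6.0, 5.5],
    "N_terminal": [0.5, 0.6, 0.4],
}

summary = summarize(activity)
print(relative_to(summary, "full_length"))
```

With measured values in place of the placeholders, the ratio for the FLZ domain construct corresponds to the "nearly half strength" comparison reported in the Results.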

Bimolecular Fluorescent Complementation 

pSAT4-DEST-N (1-174) EYFP-C1 and pSAT5-DEST-C (175-END)
EYFP-C1 vectors were used for BiFC [55]. The FLZ1 CDS, its
deletion fragments and PFA-DSP3 were cloned in the pCR8/GW/TOPO
vector and transferred to the pSAT4-DEST-N (1-174) EYFP-C1
and pSAT5-DEST-C (175-END) EYFP-C1 vectors, respectively,
using Gateway cloning technology (Invitrogen, CA). The
primers used for cloning are shown in Table S3. BiFC was done in
onion epidermal cells using a PDS-1000 Helios Gene Gun (Bio-Rad)
[56]. Interaction was checked with a TCS SP2 (AOBS) laser confocal
scanning microscope (Leica Microsystems) 24 hours after
bombardment.

DAPI Staining 

Onion peels were subjected to DAPI staining before visualization
under the confocal scanning microscope. Onion peels were washed
with PBS, pH 7.5, and stained with 15 µg/mL DAPI solution for
30 minutes in the dark. Peels were washed again with PBS, pH 7.5,
and visualized under the confocal scanning microscope.

Subcellular Localization Study 

Subcellular localization studies were done in onion epidermal
cells. FLZ1 and PFA-DSP3 were cloned in the pEG104 vector [57].
The constructs were bombarded into onion peels using a PDS-1000
Helios Gene Gun (Bio-Rad) [56]. The results were analyzed 24
hours after bombardment under a TCS SP2 (AOBS) laser confocal
scanning microscope (Leica Microsystems).

Supporting Information

Figure S2 Phylogenetic tree of A. thaliana FLZ domain
containing proteins.

(PPT)

Figure S3 Expression profile of 3 selected FLZ domain
containing genes of Arabidopsis.

Figure S4 Sample result of threading of FLZ domain
using Phyre.

Table S1 List of FLZ domain containing proteins in
sequenced plant genomes.

Table S2 List of FLZ1 interacting proteins obtained in